100+ Projects
1500+ Hosts
100+ Build Variants
400k hours/month
Performance = 5% of that
(more in $$$)
Microbenchmarks
System performance test
EC2
(this talk)
c3.8xlarge, SSD
Repeatable results
(NOT max performance)
Assumption | True / False |
---|---|
Dedicated instance = more stable performance | Not tested |
Placement groups minimize network latency & variance | Not tested |
Different availability zones have different hardware | Seems False |
For write heavy tests, noise comes from disk | False |
Ephemeral (SSD) disks have least variance | False |
There are good and bad EC2 instances | False |
Just use i2 instances (better SSD) | False (True in theory) |
You can't use cloud for performance testing | False |
We tested many aspects of EC2 and of our own system. To help you follow the presentation, I will reveal up front which assumptions were made when the system was first built, and how those assumptions fared in our testing.
In the rest of the presentation I will share how we tested different EC2 configurations and came to these conclusions.
It's common to see engineers make design decisions based on things they read on the internet. As you can see, our system included LOTS of them! I call it witchcraft: old wives' tales, not based in science. The point of this presentation is that this is a bad idea. There are no shortcuts. Assume nothing. Measure everything.
noise = (max - min) / median
Goal is to minimize this single metric
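As a minimal sketch of the metric above, the noise of one test can be computed from its repeated results as follows. The function name `noise` and the sample numbers are hypothetical, chosen only for illustration; they are not taken from our actual test harness.

```python
from statistics import median


def noise(samples):
    """Return (max - min) / median for a sequence of repeated results."""
    if not samples:
        raise ValueError("need at least one sample")
    mid = median(samples)
    if mid == 0:
        raise ValueError("median is zero; noise is undefined")
    return (max(samples) - min(samples)) / mid


# Hypothetical throughput numbers (ops/sec) from five repeated runs.
runs = [9800.0, 10000.0, 10100.0, 10400.0, 9900.0]
print(f"noise = {noise(runs):.1%}")
```

Dividing by the median makes the number comparable across tests with very different absolute throughput, which is what lets a single metric be minimized.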
There are good and bad EC2 instances | False |
(min - median - max)
for each test & thread level
mmapv1 left, wiredTiger right
insert_vector, insert_ttl, index_build
highest; jtrue lowest
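A summary like the one plotted here (min, median, max for each test and thread level) could be produced roughly as below. This is a sketch under assumptions: the `(test, threads, value)` row layout, the `summarize` helper and all numbers are hypothetical, not the format of our real result data.

```python
from collections import defaultdict
from statistics import median


def summarize(results):
    """Group (test, threads, value) rows; return min/median/max/noise per group."""
    groups = defaultdict(list)
    for test, threads, value in results:
        groups[(test, threads)].append(value)

    summary = {}
    for key, values in groups.items():
        lo, mid, hi = min(values), median(values), max(values)
        # Assumes positive throughput values, so the median is never zero.
        summary[key] = {"min": lo, "median": mid, "max": hi,
                        "noise": (hi - lo) / mid}
    return summary


# Hypothetical results: three repeated runs at two thread levels.
results = [
    ("insert_vector", 1, 950.0),
    ("insert_vector", 1, 1000.0),
    ("insert_vector", 1, 1150.0),
    ("insert_vector", 16, 7900.0),
    ("insert_vector", 16, 8000.0),
    ("insert_vector", 16, 8100.0),
]

summary = summarize(results)
# List the noisiest (test, thread level) combinations first.
for (test, threads), s in sorted(summary.items(),
                                 key=lambda kv: kv[1]["noise"],
                                 reverse=True):
    print(f"{test} @ {threads} threads: "
          f"{s['min']:.0f} - {s['median']:.0f} - {s['max']:.0f} "
          f"(noise {s['noise']:.1%})")
```

Sorting by noise puts the least repeatable combinations at the top, which is where to start investigating.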
Ephemeral (SSD) disks have least variance | False |
Remote EBS disks have unreliable performance | False (piops) |
Just use i2 instances (better SSD) | False (True in theory) |
i2.8xlarge has much more RAM, and wiredTiger cacheSizeGB default is 50% of RAM. This caused checkpointing issues not seen on c3.8xlarge. |
At this point we switched to c3.8xlarge + EBS PIOPS. |
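To make the RAM effect concrete, here is a rough calculation of the default cache size on both instance types, using the 50% rule as stated above. The RAM figures (60 GiB and 244 GiB) are an assumption taken from AWS's published instance specifications, not from our measurements, and the helper name is hypothetical.

```python
# RAM per instance type in GiB. Assumption: values from AWS's published specs.
INSTANCE_RAM_GIB = {"c3.8xlarge": 60, "i2.8xlarge": 244}

# "50% of RAM", the default as described in this talk.
CACHE_FRACTION = 0.5


def default_cache_gib(ram_gib, fraction=CACHE_FRACTION):
    """Approximate default wiredTiger cache size for a given amount of RAM."""
    return ram_gib * fraction


for name, ram in INSTANCE_RAM_GIB.items():
    print(f"{name}: {ram} GiB RAM -> ~{default_cache_gib(ram):.0f} GiB cache")
```

Under these assumptions the default cache is roughly four times larger on i2.8xlarge, so the two instance types do not run the same effective configuration unless cacheSizeGB is set explicitly.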
For write heavy tests, noise comes from disk | False |
The above fio, cpu and iperf3 tests were themselves added to our daily CI tests
Kernel / user space context switches 30% slower on Jan 4, 2018
Assume nothing. Measure everything.
You can't use cloud for performance testing | False |
Image credits: l2f1 @ Flickr (CC BY), fran001 @ Flickr (CC BY), pinkmoose @ Flickr (CC BY), jo7ueb @ OpenClipart.org (PD), ivanlasso @ OpenClipart.org (PD)